A Practical Approach for Few Shot Learning with SetFit for Scaling Up Search and Relevance Ranking on a Large Text Database
Fernando Vieira da Silva • Location: TUECHTIG • Haystack EU 2023
SetFit (Sentence Transformer Fine-tuning) is a recently proposed few-shot learning technique that has achieved state-of-the-art results on multiple classification problems in label-scarce settings, even outperforming GPT-3 in many cases. For learning to rank, SetFit can be especially valuable when only a few training samples are available.
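At the core of SetFit's first stage is contrastive fine-tuning of a sentence transformer on sentence pairs generated from the few labeled examples: pairs from the same class become positives, pairs from different classes become negatives. A minimal sketch of that pair-generation step in plain Python, with toy data and hypothetical label names (not taken from the talk):

```python
from itertools import combinations

def generate_pairs(examples):
    """Build (text_a, text_b, label) training pairs from a handful of
    labeled examples: same-class pairs are positives (1), cross-class
    pairs are negatives (0). This mirrors the contrastive pair
    generation at the heart of SetFit's fine-tuning stage."""
    pairs = []
    for (text_a, cls_a), (text_b, cls_b) in combinations(examples, 2):
        pairs.append((text_a, text_b, 1 if cls_a == cls_b else 0))
    return pairs

# Toy few-shot dataset: two labeled examples per class (labels hypothetical).
examples = [
    ("statute of limitations for contract claims", "relevant"),
    ("deadline to file a breach-of-contract suit", "relevant"),
    ("recipe for chocolate cake", "irrelevant"),
    ("weather forecast for Berlin", "irrelevant"),
]

pairs = generate_pairs(examples)
# 4 examples yield C(4,2) = 6 pairs: 2 positives and 4 negatives.
```

In the actual technique, these pairs train the sentence transformer with a similarity loss, after which a lightweight classification head is fitted on the resulting embeddings.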
In our case study, we collected a small dataset in the legal research domain, consisting of real-world search queries along with relevant and irrelevant results, manually annotated by lawyers or law students.
We then trained a model with the SetFit technique and used it to generate sentence embeddings for a larger dataset to enable semantic search. We also trained a ranking model with SetFit and compared its results with other approaches for the same language and the legal domain.
In this talk, we present SetFit and its application to ranking, and we discuss the results of our experiments.
Download the Slides • Watch the Video

Fernando Vieira da Silva
N2VEC

Fernando is the CEO of N2VEC, a startup that develops a search engine API for enterprise documents. He also has a Ph.D. in Computer Science with a focus on Natural Language Processing and experience working with search, relevance ranking, NER and text classification.